CONTENT HOSTING ENVIRONMENT SYSTEM AND CACHE MECHANISM



Technical Field 

The present invention relates to a system for storing and operating upon data and, in
particular, a system providing a content dependent environment wherein the functions and
capabilities of the system and the functional elements thereof for storing and operating upon data
are dependent upon and determined by the content of the data.
Background Art 

A significant and primary historic trend in computer systems has been the continuous
development of ever more powerful and larger general purpose processors and memories at
progressively lower costs. A resultant parallel trend has been to implement all possible functions
or capabilities that may be needed or desired by users on general purpose computers, with the
specific functions or capabilities provided to the users being determined by the applications
programs running on the general purpose computers. As a consequence, it is rare to find a
system, regardless of purpose or function, that is not based upon a general purpose processor
executing a general purpose, full function operating system with one or more applications
programs, which are in turn generally as broadly functional as possible, providing the specific
functionality desired by the user. Examples of such are as diverse as database systems, word
processing and graphics creation and editing systems, environmental control and production
process systems, switched communications networks, internet servers, mass data storage
systems, and so on.

Such systems based upon general purpose processors and general purpose operating
systems are generally regarded as advantageous because the functions and capabilities of the
systems may be modified or expanded merely by adding or modifying the applications programs
running on the systems. A recurring problem with such systems for many users, however, is that
not all users require or are interested in the full range of functions and capabilities that may be
provided by such systems, but require or are interested in only a specific, limited and well
defined set of capabilities or functions.

For example, many users are interested in and need only a basic word processing program
capable of creating and editing text, but without extensive document formatting capabilities, as
may be used for presentations or elaborate newsletters and the like, or the capability to
incorporate spreadsheets, graphics, database information, and so on, from other applications
programs. One prominent New England writer, for example, is known to have written many best
selling books on a basic, single function word processing system having only the insert,
overstrike, delete, copy, cut and paste functions, and is rumored to continue to write on that
system. Typically, however, most single function systems are of little use for serious work. For
example, most dedicated word processing systems are little more than elaborated typewriters
without the functionality necessary for extensive, serious writing tasks.

As a consequence, most users are forced to accept the cost and complexity of a full
function, general purpose system executing a full function, general purpose operating system, and
applications programs of the corresponding complexity necessary to operate on a general purpose
processor and operating system, solely to perform a single, well defined and limited function or
task. It must also be noted that the cost and complexity of using a full function, general purpose
system for a single task is not limited only to the cost and complexity of the hardware and
software, but often includes the need for expert, highly trained operators to operate and maintain
the system.

Another persistent problem in computer systems is in the various methods by which the
systems store data and, most particularly, in the methods used to provide faster access to at least
selected portions of the data, such as the various forms of caching mechanisms. As is well known
and understood by those of ordinary skill in the relevant arts, the bulk of the data residing in a
system is typically stored in relatively slow mass storage devices, such as disk drives and large
but relatively slow memories. Selected portions of the data, however, are stored in a cache
mechanism that is interposed between the mass storage devices and the system and that includes
a relatively faster cache memory, so that the selected portions of the data are provided to the
system from the faster cache memory. While caching mechanisms have been the subject of much
development effort, however, the development efforts have largely been directed to methods for
constructing faster caches and to methods for selecting the data to be stored in the cache memory.
As a consequence, the cache mechanisms of the prior art have been generally unsatisfactory
because of the inherent limitations of these approaches. That is, there are practical, physical
limitations on how fast a cache memory can be constructed and only so many ways, such as the
various least recently used algorithms, to select which data should reside in the cache memory.
For example, virtually every system hosts files of significantly larger size or extent than
the average but which are typically accessed less frequently than are smaller files, often because
of their size rather than because of their lack of significance. Such files are frequently encached
by the systems of the prior art, however, because the caches are generally managed by various
frequency of use algorithms and, although less frequently accessed, they are accessed frequently
enough to be encached according to the cache management algorithms commonly in use. The
encaching of large files, however, often results in the effective loss of a significant part of the
cache memory capacity that may be needed for caching smaller and less significant but more
frequently accessed files. As a consequence, the entire system, or at least the caching mechanism,
may essentially come to a halt or be significantly slowed because of the additional writing back,
or flushing, and reloading of smaller, more frequently accessed files resulting from the use of
cache memory capacity for very large files.

Yet another problem of the cache mechanisms of the prior art is that virtually every
system includes files having contents that are essential to the operation of the system, or at least
to the operations of one or more users, but which are not accessed as frequently as less critical
files. In the caching mechanisms of the prior art, however, which typically include a least
frequently used algorithm to determine which files are encached in the cache memory and which
files are stored in a mass storage device, such files may be "flushed" out to the mass storage
device and thus not be readily accessible when needed, so that essential functions of the system
are delayed while the files are retrieved.

The present invention provides a solution to these and other problems of the prior art. 
Disclosure of the Invention

In a first aspect, the present invention is directed to a content hosting system for storing
and publishing data; that is, a system that provides an operating environment that is content
sensitive and thereby hosts data according to the content, or type, of the data. According to the
present invention, each functional element of the system and each group of cooperatively
operating functional elements is designed to host a particular content, that is, a particular type or
form of data. In a system intended to operate with a single content, therefore, each functional
element of the system will be designed and optimized to store and publish data of that content,
and the system may be expanded to host additional contents by the addition of further functional
elements corresponding to and designed and optimized for each additional content.

In a first embodiment, a content hosting system of the present invention includes at least
one content delivery server for storing and publishing data of a corresponding type wherein each
content delivery server is accessible to at least one content requester for providing the data stored
therein to the at least one content requester, and, in systems having two or more content delivery
servers, a content request director connected with each content delivery server and accessible to
each content requester for identifying the locations of data in each content delivery server.
According to the present invention, a content request director contains information identifying
the type of information stored in each content delivery server and the locations of data stored in
each content delivery server and is responsive to a request for data from a content requester for
forwarding the request to the corresponding content delivery server, so that the content delivery
servers appear to a content requester as a single content delivery server containing all of the
content hosted in the content hosting system.

The content hosting system also includes a content delivery administrator that is
connected with each content delivery server and with the content request director, if one is
present, and is accessible to at least one content publisher wherein the content delivery
administrator is responsive to a request from a content publisher for a content editing operation
to be performed on data stored in a content delivery server, such as storing, editing or managing
the content, for providing [illegible]
corresponding content delivery server is responsive to the content editing request for performing
the requested content editing operation on the data. The content delivery administrator is also
responsive to the content requester and [illegible]
upon the data for managing the storing of data in the content delivery servers to balance the load
of requests among the content delivery servers.

In further aspects of the present invention, a content hosting system may include a content
backup server connected with each content delivery server for storing a copy of the data stored in
each content delivery server and for providing the stored data to the content delivery servers.

In a further embodiment of the present invention, a content hosting system operates as a
network server wherein the content request director is connected from a network and to the
content delivery servers for forwarding requests from content requesters connected to the
network to the content delivery servers while the content delivery administrator is connected
from the content publishers and to the content delivery servers for providing access to the content
delivery servers to the content publishers.

In a yet further embodiment, a hierarchical content hosting system will include a plurality
of content hosting systems, each including at least one content delivery server for hosting content
of a corresponding type wherein each content delivery server is accessible to at least one content
requester for providing the content stored therein to the at least one content requester, and, in
systems having two or more content delivery servers, a content request director connected with
each content delivery server and accessible to each content requester for forwarding requests
from content requesters to the content delivery servers, and a content delivery administrator
connected with each content delivery server and the content request director and accessible to at
least one content publisher.

The hierarchical content hosting system will generally be organized as a tree structure,
with a content hosting system occupying each node of the tree wherein the content request
director of each content hosting system is associated with and contains information identifying
the locations of content in the associated content hosting system. At least some of the content
hosting systems of a hierarchical content hosting system are capable of acting as content
distributors wherein a content distributor is capable of replicating content to the content hosting
systems located on the nodes of the branches descending from the content distributor system. As
such, content published in one content hosting system may be replicated from a content
distributing hosting system to the content hosting systems located at the nodes of the branches
descending from that content hosting system.
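
The replication of content down the branches of the tree may be illustrated by the following minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the invention: the names `HostingNode`, `is_distributor`, `publish` and `_replicate_down` are hypothetical, and each content hosting system is reduced to a dictionary of content keyed by name.

```python
class HostingNode:
    """Hypothetical stand-in for one content hosting system at a node of the tree."""

    def __init__(self, name, is_distributor=False):
        self.name = name
        self.is_distributor = is_distributor  # assumption: a simple per-node flag
        self.children = []                    # nodes on the branches directly below
        self.content = {}                     # content key -> data hosted at this node

    def add_child(self, child):
        self.children.append(child)
        return child

    def publish(self, key, data):
        """Store content locally and, if this node is a distributor, replicate it."""
        self.content[key] = data
        if self.is_distributor:
            self._replicate_down(key, data)

    def _replicate_down(self, key, data):
        # Copy the content to every node on every branch descending from this node.
        for child in self.children:
            child.content[key] = data
            child._replicate_down(key, data)


# Usage: a three-level tree in which only the root acts as a content distributor.
root = HostingNode("global", is_distributor=True)
region = root.add_child(HostingNode("regional"))
local = region.add_child(HostingNode("local"))

root.publish("/index.html", "global page")      # replicated to every descendant
region.publish("/memo.html", "regional only")   # stays on the regional node
```

In this sketch content published at a non-distributor node stays local, which is one plausible reading of the description above.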
In another aspect, the present invention is directed to a cache mechanism for use in a data
processing system, such as a content hosting system or other form of data processing system.

According to the present invention, the cache mechanism of the present invention
includes a first cache for storing and providing data from files of less than a stored predetermined
threshold size and a second cache for storing and providing data from files of greater than the
stored predetermined threshold size. The first cache includes a first cache controller for
controlling operations of the first cache and a first cache memory for storing data of files smaller
than the threshold size and the second cache includes a second cache controller for controlling
operations of the second cache and a second cache memory for storing data of files greater than
the threshold size. The cache mechanism also includes a file filter for storing a value representing
the threshold size that is responsive to data read requests for directing read requests for data of
files less than the threshold size to the first cache controller and read requests for data of files
greater than the threshold size to the second cache controller. The cache mechanism will
generally also include a mass storage device connected from the first and second caches for
storing uncached data.
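
The routing performed by the file filter can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the mechanism of the invention: the class names `SimpleCache` and `FileFilter` are hypothetical, each cache uses a plain least recently used policy as a stand-in for its controller, the mass storage device is modeled as a dictionary, and a file exactly equal to the threshold size is sent to the second cache because the text does not say where it belongs.

```python
from collections import OrderedDict


class SimpleCache:
    """Hypothetical cache controller plus cache memory with LRU flushing."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # file name -> data, oldest first

    def read(self, name, load_from_mass_storage):
        if name in self.entries:
            self.entries.move_to_end(name)  # mark as most recently used
            return self.entries[name]
        data = load_from_mass_storage(name)  # cache miss: go to mass storage
        self._store(name, data)
        return data

    def _store(self, name, data):
        # Flush least recently used entries until the new data fits.
        while self.entries and self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        if len(data) <= self.capacity_bytes:
            self.entries[name] = data
            self.used_bytes += len(data)


class FileFilter:
    """Stores the threshold size and directs each read to one of the two caches."""

    def __init__(self, threshold_bytes, small_cache, large_cache, mass_storage):
        self.threshold_bytes = threshold_bytes
        self.small_cache = small_cache
        self.large_cache = large_cache
        self.mass_storage = mass_storage  # assumption: dict of file name -> bytes

    def read(self, name):
        size = len(self.mass_storage[name])
        # Assumption: files exactly at the threshold go to the second cache.
        cache = self.small_cache if size < self.threshold_bytes else self.large_cache
        return cache.read(name, self.mass_storage.__getitem__)


# Usage: small files never compete with large files for cache capacity.
storage = {"a.txt": b"x" * 10, "big.bin": b"y" * 1000}
small = SimpleCache(capacity_bytes=100)
large = SimpleCache(capacity_bytes=5000)
file_filter = FileFilter(100, small, large, storage)
file_filter.read("a.txt")
file_filter.read("big.bin")
```

Keeping the two caches separate means that loading a very large file can only flush other large files, which is the effect the description attributes to the threshold.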

In a further aspect of the cache mechanism of the present invention, the first cache
memory is further organized as a cache memory for storing and providing the data of files
smaller than the threshold size and a pinned memory for storing and providing the data of a
pinned file wherein the pinned memory is not functionally separate from the cache memory but is
effectively contiguous with the cache memory wherein the parts of the cache memory that
comprise the pinned memory are identified by the state of the contents of those portions of cache
memory and wherein the data of a pinned file is locked from flushing from the pinned memory.
According to this aspect of the cache mechanism, the cache mechanism further includes a cache
manager for directing operations of the cache mechanism, including storing a pinning threshold
value representing a threshold frequency of access of files and a cache log for recording accesses
to each file having data resident in the first and second caches wherein the cache manager is
responsive to the frequency of access of each file having data resident in the first and second
caches for storing each file having a frequency of access greater than the threshold frequency in
the pinned memory.

In a further embodiment of the cache mechanism, the pinned memory and the cache
memory are designated locations in a small cache memory and a location of the small cache
memory is designated as in the pinned memory [illegible] in the location
is stored in a pinned table. Also, in a yet further aspect of the cache mechanism, the cache
manager is responsive to the frequency of accesses of a file stored in the second cache for
transferring the file into the pinned memory when the frequency of access is greater
than the threshold frequency.
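
The pinning behavior described above may be sketched as follows. This is a hedged illustration only: the names `CacheManager`, `record_access` and `flush` are hypothetical, the two caches are modeled as dictionaries, and a simple count of accesses stands in for the "frequency of access" because the text does not define the time window over which frequency is measured.

```python
from collections import Counter


class CacheManager:
    """Hypothetical cache manager with a cache log and a pinned table."""

    def __init__(self, pinning_threshold):
        self.pinning_threshold = pinning_threshold  # assumption: an access count
        self.cache_log = Counter()    # file name -> recorded accesses
        self.pinned_table = set()     # names of files locked against flushing
        self.small_cache = {}         # first cache: file name -> data
        self.large_cache = {}         # second cache: file name -> data

    def record_access(self, name):
        """Log an access and pin the file once it exceeds the threshold."""
        self.cache_log[name] += 1
        if self.cache_log[name] > self.pinning_threshold and name not in self.pinned_table:
            # A frequently accessed file in the second cache is first transferred
            # into the first cache, where the pinned memory resides.
            if name in self.large_cache:
                self.small_cache[name] = self.large_cache.pop(name)
            if name in self.small_cache:
                self.pinned_table.add(name)

    def flush(self, name):
        """Flush a file from the caches; pinned files are never flushed."""
        if name in self.pinned_table:
            return False
        removed_small = self.small_cache.pop(name, None) is not None
        removed_large = self.large_cache.pop(name, None) is not None
        return removed_small or removed_large


# Usage: a file in the second cache is pinned after its third access.
manager = CacheManager(pinning_threshold=2)
manager.large_cache["critical.cfg"] = b"settings"
manager.small_cache["scratch.tmp"] = b"temp"
for _ in range(3):
    manager.record_access("critical.cfg")
```

Here the pinned table only marks entries of the first cache as locked, so the pinned memory is not a separate store, in keeping with the statement that it is effectively contiguous with the cache memory.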
Brief Description of the Drawings 

The foregoing and other objects, features and advantages of the present invention will be 
apparent from the following description of the invention and embodiments thereof, as illustrated 
in the accompanying figures, wherein:

Fig. 1 is a block diagram of a content hosting system;

Fig. 2A is a block diagram of a content hosting system implemented as a network file
server;

Fig. 2B is a block diagram of a content delivery server in a network file server;

Fig. 3 is a block diagram of a hierarchical content hosting system;

Fig. 4 is a block diagram of a content delivery server;

Fig. 5 is a block diagram of a content delivery administrator;

Fig. 6 is a block diagram of a content request director; and

Fig. 7 is a block diagram of a content sensitive cache mechanism.
Best Mode for Carrying Out the Invention

A. Principles of Operation

As will be described in the following, the content hosting environment system of the
present invention is based on the principle that the requirements for hosting data, that is, storing,
editing and managing data, such as the various forms of information contained in files, are
determined by the content of the data. In this regard, and for purposes of the present invention,
content is essentially the type or form of data contained in a file, an object or other form of body
of data, such as, and for example, alphanumeric text of some particular form, such as a
document, a spreadsheet, or a database record, graphic information, a web page, such as a
hypertext markup language (HTML) page, and so on.

The requirements for storing, editing or managing data are essentially determined by, and
limited for, each particular data content, that is, the particular type or form of data. For example,
it is not necessary to provide graphics editing capabilities in a system functional element that is
intended to operate on text files and it is likewise unnecessary to provide text editing functions in
a system functional element that is intended for the creation and editing of bitmapped graphics.
Likewise, it is not necessary to provide graphics editing capabilities in a system element that is
intended only to publish the graphics data as the graphics data may be created and edited in a
separate system or element.

In a like manner, the requirements [illegible]
are determined and limited, to a great extent, by the data content. For example, text files typically
require less space for storing than do graphics files, so that a storage element or unit intended for
text content may be designed or optimized to handle a relatively large number of smaller files
while a storage element or unit intended for graphics content could be designed to handle a
smaller number of larger files.
As will be described in the following, therefore, the system of the present invention
provides an operating environment that is content sensitive, that is, hosts data according to the
content of the data, so that each functional element of the system and each group of cooperatively
operating functional elements is designed to host a particular content, or type or form of data. In
a system intended to host a single content, therefore, each functional element of the system will
be designed and optimized for data of that content, and the system may be expanded to host
additional contents by the addition of further functional elements corresponding to and designed
and optimized for each additional content. In this regard, it will be recognized that certain
contents share common characteristics, so that at least some of the functional elements designed
for and intended for use with one content will also be functional with one or more other contents.
Therefore, and according to the present invention, each functional element of the system
is essentially designed as a single purpose element to host a single content and is thereby required
to meet and provide only a limited and well defined set of requirements and functions, generally
those functions necessary for publishing, editing and managing the content. As such, each
functional element can and typically will be simpler, less complex, of lower cost, and simpler to
operate and maintain than a corresponding functional element of a full, general purpose system.

Such elements may be referred to as "appliances" wherein the term "appliance" denotes a
relatively less complex and expensive functional element that is relatively easy to install and
maintain and is designed to provide a relatively limited range of functions for a single content, or
a set of closely related contents. According to the present invention, a user may install an
appliance, or set of cooperatively operating appliances, for each specific content desired by the
user, so that a system having a plurality of contents will be comprised of corresponding single
content appliances.

Having described the basic principles of operation of a system according to the present
invention, the following will describe an implementation of [illegible] in a
system, such as an internet web server. In this regard, it will be noted that the detailed design,
construction and operation of many of the components of such a system will be well understood
by those of ordinary skill in the relevant arts, as will various alternate implementations and
embodiments of such components. The following discussions will describe the
components of a system according to the present invention only to the level of detail necessary
for one of ordinary skill in the relevant arts to implement such a system.
B. Description of a Content Hosting Environment System 

1. General Description (Fig. 1)
Referring now to Fig. 1, therein is shown a generalized block diagram of a system,
hereafter referred to as a Content Hosting Environment (CHE) 10 of the present invention.
Before beginning the following descriptions, it must be noted that the term "content" is used
herein to refer both to the type, class or form of a body of data, such as a file or object, and to the
data itself, or a body of data, thereby emphasizing, for purposes of the following descriptions, the
essential correlation or unity between a body of data and its content according to the basic
principles of the present invention.

As illustrated in Fig. 1, a CHE 10 is typically comprised of four types of components, or
functional units. These components include one or more Content Delivery Servers (CDSs) 12
that are interconnected with a Content Delivery Administrator (CDA) 14 and, in those CHEs 10
having two or more CDSs 12, a Content Request Director (CRD) 16. A CHE 10 will usually also
include at least one Content Backup Server (CBS) 18.

As will be described further below, each CDS 12 operates as a host for a corresponding
content, that is, type or form of data, and stores and provides that content to Content Publishers
20P and Content Requesters 20R, which interface respectively with the CDA 14 and the CRD
16. In the general embodiment of a CHE 10 according to the present invention, therefore, a CHE
10 thereby supports two types or classes of user wherein Content Publishers 20P are generally
those users permitted to add to or otherwise modify the contents of a CHE 10 while Content
Requesters 20R are generally those users permitted only to read or receive content from a CHE
10, although in certain embodiments neither limitation need be enforced, or other limitations may
be used. Also, while separate interfaces and capabilities for two classes of users are not necessarily
implemented in all embodiments of a CHE 10, this capability is particularly useful in, for
example, network servers, such as World Wide Web servers. In this example, the publishers of
web pages on a given CHE 10 would function within the capacity of Content Publishers 20P
while web browsers accessing the CHE 10 from across the internet would function as Content
Requesters 20R.

The CDA 14 therefore provides a unified interface for the Content Publishers 20P
through which the Content Publishers 20P [illegible] CDSs 12. A CDA 14 also,
for example, monitors the operation of a CHE 10 to provide load balancing among the CDSs 12
by controlling and managing the distribution of content among the CDSs 12, for example, by
selecting which CDS 12 is to receive and store new data and, in at least some embodiments, by
transferring content between the CDSs 12.

CRDs 16, in turn, receive all incoming requests from Content Requesters 20R and, based
upon information stored therein provided from the CDA 14, identify the CDSs 12 containing, or
hosting, the requested content and will forward the Content Requester 20R requests to the
corresponding CDSs 12. As will be described further below, the CRD 16 is not involved in the
actual providing of content from a CDS 12 to a Content Requester 20R, as the requested content
is, in general, provided directly from the corresponding CDSs 12 to the Content Requesters 20R.
Finally, each CBS 18 provides backup and archival services for the CHE 10.

2. A CHE 10 As A Network Server (Figs. 2A, 2B and 2C)
Referring now to Fig. 2A, therein is illustrated an exemplary CHE 10 operating as a
network server connected from a Communications Network 22, such as a web server connected
from the internet. As is well known and understood, the World Wide Web, usually referred to as
"the Web", is constructed on the internet and a given site, or server, may have, or "host",
literally thousands of hypertext markup language (HTML) pages, the location of each being
identified by a corresponding universal resource locator (URL) and each page frequently
containing one or more URLs to other pages. As is well understood, each URL essentially points
to or is an address of a location of a body or source of information on the internet and contains
the information required to direct a TCP/IP based protocol executing on a system to a particular
location or resource on the internet and on intranets, such as a CHE 10, hosting the
corresponding page.

As represented in Fig. 2A, a typical web server CHE 10 will include one or more CDSs
12, each of which is a web manager, or server, for storing and providing web pages in response
to URLs received from a Content Requester 20R through Communications Network 22 and the
CRD 16. In this example, and as generally illustrated in Fig. 2B, each CDS 12 may be comprised,
for example, of the Netscape Navigator program executing on a Personal Computer (PC) 24 and
may include a Mass Storage Device (MSD) 26 such as a disk drive or equivalent form of mass
storage device, with a Controller 28 that is responsive to read and write requests including
identifications of pages stored or to be stored therein to read or write the corresponding pages
from MSD 26 and to provide the pages to Communications Network 22. In other
implementations a CDS 12 may be comprised of a [illegible] server type element, generally
similar to [illegible] executing a network server program written
specifically for the CDS 12, but will appear generally as illustrated in Fig. 2B.

As has been described, each CHE 10 containing two or more CDSs 12 will include a
CRD 16 wherein the CRD 16 contains or stores information identifying, for the associated CHE
10, the CDSs 12 storing particular content and, for this purpose, includes a Content Map (Map)
30 identifying the CDS 12 hosting each content, such as a web page, hosted in the CHE 10. In
the present example, and as illustrated in Fig. 2C, Map 30 contains an Entry 32 for and
corresponding to each page hosted in a CDS 12 of the CHE 10 wherein each Entry 32 is indexed
by the URL of the corresponding page stored in the CDSs 12. Each Entry 32, in turn, typically
contains a CDS Identification (CDSI) 34A of the CDS 12 hosting the corresponding page. As is
well understood in the art, each CDS 12 may typically store the pages in the form of files and
each CDSI 34 may include a Content Location (CL) 34B identifying a location of a page in CDSs
12, for example, a file and path name and track and sector identifiers, depending upon the
specific implementation of the CDSs 12 and Controllers 28.
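
The structure of Map 30 and its use in routing a request can be sketched as follows. This is a minimal sketch, assuming a dictionary keyed by URL; the names `Entry`, `ContentMap`, `set_entry`, `lookup` and `route_request`, the server identifiers and the example paths are all hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    """One map entry: which server hosts a page and, optionally, where."""
    cds_id: str                     # plays the role of the CDS Identification
    location: Optional[str] = None  # plays the role of the Content Location


class ContentMap:
    """Hypothetical content map indexed by the URL of each hosted page."""

    def __init__(self):
        self._entries: Dict[str, Entry] = {}

    def set_entry(self, url, cds_id, location=None):
        self._entries[url] = Entry(cds_id, location)

    def lookup(self, url):
        return self._entries.get(url)


def route_request(content_map, url):
    """Return the identifier of the server a request should be forwarded to."""
    entry = content_map.lookup(url)
    if entry is None:
        return None  # assumption: unknown URLs are simply not routed
    return entry.cds_id


# Usage: two pages hosted on different servers behind one request director.
content_map = ContentMap()
content_map.set_entry("/index.html", "cds-1", "/pages/index.html")
content_map.set_entry("/catalog.html", "cds-2", "/pages/catalog.html")
target = route_request(content_map, "/catalog.html")
```

Because requesters only ever present a URL, a page can be moved to another server by rewriting its entry without changing the URL that requesters hold.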

As described, a CRD 16 will receive a request for content from a Content Requester 20R,
will identify the CDS 12 in the CHE 10 hosting the requested content, and will forward the
request to the proper CDS 12, which will respond by providing the requested content to the
Content Requester 20R. CRDs 16 thereby cause the CDSs 12 of a CHE 10 to appear as a single,
integrated CDS 12 to the Content Requesters 20R, so that Content Requesters 20R are not aware
of the internal structure or organization of a CHE 10. It will be recognized that the use of a CRD
16 in each CHE 10 containing more than one CDS 12 is of particular value in a World Wide
Web type system wherein pages and portions of pages are linked together through URL type
identifiers, and wherein pages or portions thereof are identified for future reference by
"bookmarking", that is, the storing of URLs. In particular, all references or links to pages or
content in a given CHE 10 from outside the CHE 10, such as URLs bookmarked or otherwise
stored by a Content Requester 20R, essentially refer to the CHE 10's CRD 16, so that pages
or content hosted in the CDSs 12 of the CHE 10 may be moved, edited, updated, added to, or
otherwise modified without invalidating the references.

It will also be recognized by those of ordinary skill in the relevant arts that the Maps 30
residing in the CRDs 16 of CHEs 10 hosting other contents will differ in detail from that
illustrated herein, but will be generally similar in structure and function. It will also be
recognized by those of ordinary skill in the relevant arts that a particular content, that is, a
particular body of data or file, may be comprised of multiple parts, such as a set or group of
related contents, a multi-part file or set of linked files, and that the parts may reside
on separate CDSs 12.

As has been described, the CHE 10 also includes a CDA 14 that provides a unified
interface for Content Publishers 20P wherein, in this embodiment, Content Publishers 20P are
the users publishing web pages through the CHE 10. As described, the CDA 14 monitors the
operation of the CHE 10, such as the storage or hosting of web pages among the CDSs 12 and, in
this embodiment, most probably the number of requests serviced by each CDS 12, and balances
the load among the CDSs 12 by controlling and managing the distribution of content, that is,
pages, among the CDSs 12. CDA 14 may do so by selecting which CDS 12 is to receive and
store new data, and by transferring pages between the CDSs 12 as necessary to balance the
servicing of requests by the CDSs 12. As will be discussed further below, CDA 14 generates and
updates an original copy of Map 30 as the pages are stored in or transferred among the CDSs 12,
and provides an updated copy of the Map 30 to the CRD 16 for use by the CRD 16 in routing
incoming requests from Communications Network 22. Finally, and as also described, the CBS 18
provides backup and archival services for the pages hosted by the CHE 10.
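
Purely by way of illustration, and not as a description of the actual implementation, the placement of new pages and the maintenance of the two copies of Map 30 by the CDA 14 may be sketched in Python as follows. The class name, the field names, and the use of the request count as the only load metric are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ContentDistributor:
    """Hypothetical stand-in for a CDA 14; all names here are illustrative."""
    request_counts: dict[str, int]   # CDS name -> requests serviced (assumed metric)
    master_map: dict[str, str] = field(default_factory=dict)    # content id -> CDS name
    director_map: dict[str, str] = field(default_factory=dict)  # copy held by the CRD 16

    def least_loaded(self) -> str:
        # Pick the CDS that has serviced the fewest requests.
        return min(self.request_counts, key=self.request_counts.get)

    def store(self, content_id: str) -> str:
        """Place new content on the least loaded CDS and refresh both Map copies."""
        target = self.least_loaded()
        self.master_map[content_id] = target
        self._publish_map()
        return target

    def move(self, content_id: str, target: str) -> None:
        """Transfer content between CDSs; only the Map changes, not the external link."""
        self.master_map[content_id] = target
        self._publish_map()

    def _publish_map(self) -> None:
        # The CRD receives an updated copy rather than sharing the original.
        self.director_map = dict(self.master_map)


cda = ContentDistributor(request_counts={"cds-a": 120, "cds-b": 45})
placed_on = cda.store("/index.html")
cda.move("/index.html", "cds-a")
```

In this sketch a page keeps the same content identifier when it is moved, which mirrors the point made above that external references remain valid while content is relocated among the CDSs 12.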

It will be recognized by those of ordinary skill in the relevant arts that a user, such as a
business office, may install a number of different CHEs 10, each serving a different function or
set of functions with respect to a corresponding content, or each providing a different content.
For example, the office may have a web server CHE 10 as described above, but may also have
one or more CHEs 10 constructed to handle a particular text content, such as a database, or one
or more graphics content CHEs 10.

3. A Hierarchical CHE 10 (Fig. 3)
Referring to Fig. 3, however, therein is illustrated an exemplary Hierarchical/Nested CHE
(HCHE) 36 comprised of a plurality of hierarchical or nested CHEs 10, all of which may be
constructed to host the same content or at least some of which may be constructed to host
different contents. The HCHE 36 shown therein may also be geographically distributed, with the
CHEs 10 being distributed on a local, regional or global basis. In a HCHE 36 constructed
according to Fig. 3, for example, the highest tier of CHEs 10 may function on the global level,
the next tier on the regional level, and the lowest tier on the local or departmental level.
This configuration will thereby support the desire of certain clients to publish
on a global scale while allowing others to publish only on a local or department level.

As generally illustrated in Fig. 3, a HCHE 36 will generally be organized as a tree
structure with a CHE 10 occupying each node of the tree and the hierarchical relationship
between the CHEs 10, such as that between superior and subordinate CHEs 10, defined
according to the branches of the tree structure. At least some of the CHEs 10 of a HCHE 36 will
generally be capable of acting as content distributors, wherein a content distributor CHE 10 is
capable of replicating content to the subordinate CHEs 10 located on the nodes of the branches
descending from the content distributor CHE 10. As such, content published in a distributor CHE
10 may be replicated from that CHE 10 to the subordinate CHEs 10 located at the nodes of the
branches descending from that distributor CHE 10.

Again, the CRD 16 of each CHE 10 contains information identifying the locations of
content in the associated content hosting system, so that a given Content Requester 20R will not
be aware of whether it is communicating with a single CDS 12 or multiple CDSs 12. In the
instance of a HCHE 36 with a pattern of distributed or replicated content defined by the
hierarchical tree structure of the HCHE 36, therefore, the access that a given Content Requester
20R has to published content will be determined by the CHEs 10 to which the Content Requester
20R has access, so that different Content Requesters 20R may be provided with different
content according to the configuration of the HCHE 36 tree structure. For example, a publisher
of an electronic magazine may be based in New York and may produce separate editions for the
East and West coast regions of the United States. The publisher may therefore upload both
versions into a global CHE 10 located in Chicago, Illinois and the two editions may be
respectively replicated to regional CHEs 10 in, for example, Washington, D.C. and Los Angeles,
California and, from there, to local CHEs 10 in the two regions. The local subscribers of the
magazine would then access their regional editions through the local CHEs 10. If a particular
subscriber located, for example, in the eastern region, wished to access the western edition, that
subscriber would do so by obtaining access to a western local CHE 10.
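
As an illustrative sketch only, the replication of content down the branches of the HCHE 36 tree in the magazine example may be modelled in Python as shown below. The class, the node names for the local CHEs 10, and the edition labels are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class HostingNode:
    """Hypothetical model of one CHE 10 at a node of the HCHE 36 tree."""
    name: str
    children: list["HostingNode"] = field(default_factory=list)
    content: set[str] = field(default_factory=set)

    def replicate(self, item: str) -> None:
        """Copy an item to this node and to every node on its descending branches."""
        self.content.add(item)
        for child in self.children:
            child.replicate(item)


# Local node names are invented for the example.
east_local = HostingNode("east-local")
west_local = HostingNode("west-local")
east = HostingNode("washington", [east_local])
west = HostingNode("los-angeles", [west_local])
root = HostingNode("chicago", [east, west])

root.content.update({"east-edition", "west-edition"})  # both editions uploaded globally
east.replicate("east-edition")  # reaches only the eastern branch
west.replicate("west-edition")  # reaches only the western branch
```

A requester with access only to the eastern local node therefore sees only the eastern edition, which is the behavior described in the example above.
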
4. CDSs 12 (Fig. 4)

Referring now to Fig. 4, therein is shown a generalized block diagram of a CDS 12. As
has been described, a CDS 12 serves incoming requests by providing the requested content to the
requester, whether a Content Publisher 20P or a Content Requester 20R. The context of the CDS
12, however, that is, the configuration of the specific CHE 10 or HCHE 36 in which the CDS 12
resides, the hierarchical relationship of the CHE 10 to other CHEs 10, and the source of the
requests, is transparent to the CDS 12.

As illustrated in Fig. 4, a CDS 12 may include a Mass Storage Device (MSD) 38 for
storing content and a Memory 40 for storing programs for controlling a Processor 42 for
executing requests for operations with the associated Memory 40.
The operations performed by Processor 42 may be limited to the storage and retrieval of content
to and from MSD 38, if the CDS 12 is primarily functioning as a content storage component.

A CDS 12 operates internally according to a Configuration 44, which contains parameters
describing such functions as the service level of the CHE 10, an access log, and so on, including
parameters used in the general management of the operations of the CDS 12. Depending upon
the specific design of the CDS 12 and the environment in which it is operating, the Configuration
44 may be provided, for example, as a separate file stored in MSD 38, a sub-tree in a Microsoft
Windows NT directory, or as a file through RPC calls to an API.

Each CDS 12 will also contain and maintain a set of Log 46 files to track the request
load, client trails, which content, or files, were accessed or served, and so on. The Log 46 files
are generally stored internally in directories that are accessible, for example, to the CDA 14 for
use in load balancing. It is apparent that a CDS 12 may also include components such as
Interface Drivers (DRVRs) 48 if, for example, the CDS 12 is required to provide content onto a
network or to interface with sources or destinations of content other than the CDA 14.

Finally, a given CDS 12 may include Authorization Mechanisms (AMs) 50 to assure that
only authorized users can set or modify the Configuration 44. AMs 50 may also provide general
content security by performing requester or client authorization functions.



WO 99/03047

PCT/US98/14292

Again, in other implementations a CDS 12 may be comprised of a dedicated server type
element, generally similar to an intelligent mass storage device executing a network server
program written specifically for the CDS 12, but will appear generally as illustrated in Fig. 2B.
5. CDAs 14 (Fig. 5)

Referring now to Fig. 5, a CDA 14 is generally implemented with a Processor 52 for
controlling and executing CDA 14 operations and a Memory 54 for storing content and programs
for controlling the operations of Processor 52 and will generally include a Mass Storage Device
(MSD) 56 for storing content and the programs. A CDA 14 may also include Interface Drivers
(DRVRs) 58 as necessary to physically interface with Content Publishers 20P and may include
Device Interfaces (DIs) 60 as necessary to communicate with the CDSs 12, the CRD 16 and the
CBS 18.

As described, it is intended in the preferred embodiment of a CHE 10 of the present
invention that each CDA 14 or other element of the CHE 10 should be of limited and specific
function only, that is, should perform only the functions necessary to that
element of a CHE 10, such as storing, publishing or managing the specific content. Accordingly,
a CDA 14 performs three primary functions, the first of which is to provide a "front end" and
interface to Content Publishers 20P and to provide and receive content to and from Content
Publishers 20P, for example, through DRVRs 58. Secondly, a CDA 14 also stores a CHE
Configuration 52 containing parameters defining, for example, the number, types and capacities
of CDSs 12, levels of service, request loading, operational parameters, information pertaining to
CRD 16, and so on.

Finally, a CDA 14 monitors the level of activity by the CDSs 12 of the CHE 10 and stores
content in the CDSs 12 and transfers content between CDSs 12 in a manner to accommodate the
desired request service levels. In association with this function, and as described, a CDA 14 also
maintains a resident copy of the Map 30 identifying the locations of content in the CDSs 12,
updating the resident Map 30 and the CRD 16 resident copy of the Map 30 as new content is
added to CDSs 12 or moved between CDSs 12. The CDA 14 will also exchange content location
information and configuration information with other CHEs 10 in a HCHE 36 in the same
manner, updating the Map 30 and the CRD 16 resident copy of Map 30 as necessary for the
routing of requests among the CHEs 10 of a HCHE 36, as has been described.

It will therefore be apparent that the function of the CDA 14 in a CHE 10 is to serve as
the primary decision making and management component of the CHE 10, thereby removing all
functions that are essential to the optimal functioning of the CHE 10, but which do not require
high levels of processing resources, from the request servicing components, that is, the CDSs 12
and CRD 16. The CDA 14 thereby frees the CDSs 12 and the CRD 16 to provide the maximum
levels of request servicing.

The operations performed by a CDA 14 may also include the editing or modification of
content and, if so, the CDA 14 will also provide the capability to create, edit or otherwise
modify content. In most applications, however, content will be created and edited "off line", that
is, externally to the CHE 10, and will be uploaded as necessary to the CHE 10, thereby avoiding
extended intervals wherein the content of a CHE 10 is changing and the risk of providing a
requester with content that is changing from access to access or even during an access.

Finally, and again, in other implementations a CDA 14 may be comprised of a dedicated
element designed for the specific purpose and functions necessary for its specific and limited
operations, but will appear generally as illustrated in Fig. 5.

6. CRDs 16 (Fig. 6)

Referring to Fig. 6, therein is shown a block diagram of a CRD 16. As
described above, there will be a CRD 16 in each CHE 10 containing two or more CDSs 12 and a
CRD 16 receives requests from Content Requesters 20R and directs the requests to the CDSs 12,
using the Map 30 for these purposes. As represented in Fig. 6, CRD 16 is based on a Processor
62 and a Memory 64 for storing programs controlling the operations of the Processor 62 and the
Map 30.

Requests 64 are received through a Request Interface (RINT) 66, which may, for
example, be a network interface component, and are parsed in a Request Parser (RPARS) 68 as
necessary to extract the content identification from the Request 64, with the content identification
then being used to index Entries 32 of the Map 30. The corresponding location information
identifying the location or locations of the requested content is read from Map 30 and
concatenated as necessary with, for example, any additional Request 64 information necessary
for the CDS 12 to execute the Request 64, and forwarded to the appropriate CDS 12 where the
Request 64 is executed. As has been described, the requested content is returned to the Content
Requester 20R directly from the CDS 12, rather than through the CRD 16, thereby freeing the
CRD 16 to process the next request.
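
The parse, look up and forward sequence described above may be sketched, under stated assumptions, as follows. The layout of the map, the "cds-n:/path" location strings, the host name and the choice of the first listed location are all hypothetical; only `urlsplit` is a real standard library function.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical Map 30: content identification -> list of CDS locations.
CONTENT_MAP = {
    "/news/today.html": ["cds-1:/vol0/news/today.html"],
    "/img/logo.gif": ["cds-2:/vol3/img/logo.gif", "cds-3:/vol1/img/logo.gif"],
}


def parse_request(url: str) -> tuple[str, str]:
    """Split a request into its content identification and any remaining query."""
    parts = urlsplit(url)
    return parts.path, parts.query


def route(url: str, content_map: dict[str, list[str]]) -> Optional[str]:
    """Return the forwarded request for the first known location, or None if unmapped."""
    content_id, extra = parse_request(url)
    locations = content_map.get(content_id)
    if not locations:
        return None
    target = locations[0]
    # Concatenate the location with the additional request information, if any.
    return f"{target}?{extra}" if extra else target


forwarded = route("http://host.example/news/today.html?lang=en", CONTENT_MAP)
```

The function returns only the forwarding target, since in the arrangement described the content itself is returned to the requester by the CDS 12 and not by the director.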

Finally with regard to CRDs 16, it should be noted that many CHEs 10 and HCHEs 36
may replicate content across two or more CDSs 12 or CHEs 10, respectively, so that a request
may be served from more than one location. In such implementations, the CRDs 16 will
generally also include a Session Table (Session) 72 storing a Session Entry (SE) 74 for and
corresponding to each Content Requester 20R during a moving time window, that is, over a
period extending for a predetermined interval into the past. Each such SE 74 is indexed, or
identified, by the Content Requester 20R's identification and will contain state information
regarding a request by the corresponding Content Requester 20R during that time window,
including an identification of the CDS 12 that serviced each previous request during that time
window. Session 72 thereby assures that an on-going transaction will complete properly, being
serviced by the same CDS 12.
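
A minimal sketch of such a session table, assuming a single remembered CDS per requester and a caller-supplied clock, is given below. The class and method names, the window length, and the fallback to the first candidate are illustrative assumptions only.

```python
class SessionTable:
    """Hypothetical Session 72: requester id -> (CDS name, time of last request)."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self.entries: dict[str, tuple[str, float]] = {}

    def choose(self, requester: str, candidates: list[str], now: float) -> str:
        """Reuse the CDS that served this requester within the window, else pick one."""
        chosen = candidates[0]  # assumed default when no valid session entry exists
        entry = self.entries.get(requester)
        if entry is not None:
            cds, last_seen = entry
            if now - last_seen <= self.window and cds in candidates:
                chosen = cds
        self.entries[requester] = (chosen, now)
        return chosen

    def expire(self, now: float) -> None:
        """Drop entries that have fallen out of the moving time window."""
        self.entries = {
            r: (cds, t) for r, (cds, t) in self.entries.items()
            if now - t <= self.window
        }


sessions = SessionTable(window_seconds=30.0)
first = sessions.choose("client-9", ["cds-2", "cds-3"], now=0.0)
second = sessions.choose("client-9", ["cds-3", "cds-2"], now=10.0)   # inside the window
third = sessions.choose("client-9", ["cds-3", "cds-2"], now=100.0)   # window has elapsed
```

Passing the time in explicitly keeps the sketch deterministic; a real director would presumably read a clock and expire entries periodically.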

Finally, and again, in other implementations a CRD 16 may be comprised of a dedicated
element designed for the specific purpose and functions necessary for its specific and limited
operations, but will appear generally as illustrated in Fig. 6.

7. CBSs 18

Finally, CBS 18 will not be discussed in any further detail herein, as the design and
operation of such back-up servers is well known to those of ordinary skill in the relevant arts and
the adaptation of such designs for operation in a CHE 10 will be well understood by those of
ordinary skill in the relevant arts.

Lastly, further details and discussions of an implementation of a CHE 10 of the present
invention for use as an internet Web publishing server will be found in Appendix A, which
contains descriptions particularly pertaining to the translation or mapping of requests into the
CDS 12 locations of the requested content by CRDs 16 and the containment of scripts to the
executing CDS 12.

In closing with regard to the content hosting environment as described above, it must be
noted, and reiterated, that Content Delivery Servers (CDSs) 12, Content Delivery Administrators
(CDAs) 14, Content Request Directors (CRDs) 16 and Content Backup Servers (CBSs) 18 are
conceived and implemented, according to the present invention, as "appliances". According to
the present invention, the term "appliance" denotes a relatively less complex and expensive
functional element that is relatively easy to install and maintain and is designed to provide a
relatively limited range of functions for a single content, or a set of closely related contents.
Further according to the present invention, a user may install an appliance, or set of cooperatively
operating appliances, for each specific content desired by the user, so that a system having a
plurality of contents will be comprised of corresponding single content appliances.

As such, and as described, the "appliances" described above may be implemented or
embodied in many alternate forms. For example, each appliance, or some appliances, may be
implemented as specifically designed and constructed dedicated purpose hardware and/or
software elements. Others may be implemented as software components on individual or separate
hardware systems, such as personal computers or servers, while in other instances the appliances
or at least some appliances may be implemented as software programs executing on a single,
shared hardware component, again such as a personal computer or a larger computer or server.
Also, the selection and number of different appliances comprising a given content hosting
system, and their configuration, may differ significantly from system to system, and will be
determined by the needs and functions of a particular system. For example, certain systems may
not include a content request director and others may include only a few content delivery servers
or many content delivery servers.

C. A Cache Mechanism (Fig. 7)

As has been discussed herein above, the cache mechanism of the present invention
recognizes that a recurring problem in the cache mechanisms of the prior art is that the cache
mechanisms of the prior art essentially treat all encached bodies of data, whether structured in the
form of content, files, objects or other formats or organizations, as identical except for
the frequency with which they are accessed. As such, the cache mechanisms of the prior art have
determined which data should be encached and which should be stored in slower mass memory,
and thus be slower to access, almost entirely on the frequency with which the data is used and
have thus relied, for example, on least frequently used (LFU) algorithms to determine which data
should be encached in the cache memory and which data should be stored in a slower mass
storage device. The cache mechanisms of the prior art have, as a consequence, generally
encached very large files in the same manner as smaller files because, although they are generally
less frequently accessed than are smaller files, they are accessed frequently enough to be
encached according to the cache management algorithms commonly in use.

The cache mechanism of the present invention, however, recognizes that the caching of
data, whether in the form of files, content, objects, data blocks or other data formats, hereafter
referred to generally as "files" solely for convenience and not in any form of explicit or implicit
limitation on data format or structure, is advantageously also determined by other characteristics
of the files than only frequency of use. For example, and as described above, virtually every
system hosts files of significantly larger size or extent than the average, although such files are
typically seldom accessed, often because of their size rather than because of their lack of
significance. Such files are frequently encached by the systems of the prior art, however, because
the caches are generally managed by various frequency of use algorithms and, although less
frequently accessed, they are accessed frequently enough to be encached according to the cache
management algorithms commonly in use. The encaching of large files, however, often results in
the effective loss of a significant part of the cache memory capacity that may be needed for
caching smaller and less significant but more frequently accessed files. As a consequence, the
entire system, or at least the caching mechanism, may essentially come to a halt or be
significantly slowed because of the additional writing back, or flushing, and reloading of smaller,
more frequently accessed files resulting from the use of cache memory capacity for very large
files.

The cache mechanism of the present invention also recognizes that the contents of at least
some files are typically so significant with regard to the operations being performed by the system,
or by one or more users, that such files should be immediately accessible at all times. As such,
such files should not be subjected to the operation of a least frequently used algorithm, but should
be encached at all times, regardless of how frequently or infrequently the files are accessed.

As such, the cache mechanism of the present invention is a content sensitive cache
mechanism that encaches content according to characteristics of the content other
than only the frequency with which particular content is requested. It will be apparent from the
following descriptions, however, that the cache mechanism of the present invention is useful, and
may be advantageously used, in systems other than the content hosting environment of the
present example. In a content hosting system such as that described herein, however, the cache
mechanism of the present invention may be utilized, for example and typically, in the CDSs 12
of a CHE 10 or, in certain circumstances, in the CRDs 16, particularly in a HCHE 36 wherein at
least some of the CRDs 16 host information about the content residing in one or more other
CHEs 10, as described herein above.

Referring to Fig. 7, therein is shown a block diagram of a Cache Mechanism 76 of the
present invention. As represented therein, Cache Mechanism 76 is provided with two primary
cache memory mechanisms, the first being Cache 78, which includes a Cache Controller 80, a
Cache Memory 82 and, typically but not necessarily, an associated Mass Storage Device (Mass
Store) 84A, such as a disk drive. The second cache mechanism is comprised of Large File Cache
86, which includes a Large File Cache Controller 88 and Large File Cache Memory 90 and,
typically but not necessarily, a Mass Storage Device (Mass Store) 84B, wherein Mass Store 84A
and 84B may be separate devices or a single, shared device. As indicated, in the present
embodiment, Large File Cache Memory 90 is constructed of a plurality of Buffers 92 but, in
alternate embodiments, may be constructed of a memory, such as Cache Memory 82, or any of a
plurality of other memory structures.

As will be described further below, Cache Mechanism 76 further includes a third,
additional cache mechanism embodied in a Pinned Memory 94, which interoperates with
Cache 78. As will also be described further in the following, in the presently preferred
embodiment Pinned Memory 94 is not separate from Cache Memory 82 but is contiguous with
and comprised of parts of Cache Memory 82, wherein the parts of Cache Memory 82 that
comprise Pinned Memory 94 are identified by the state of the contents of those portions of Cache
Memory 82. In other embodiments, Pinned Memory 94 may be separate from Memory 82, or a
single physical Memory 96 may be organized into separate areas designated as Cache Memory 82 and
Pinned Memory 94, so that Cache Memory 82 and Pinned Memory 94 share Memory 96.

As indicated in Fig. 7, Cache Requests 98 for data to be read or written are received by a
Cache Request Input 100, while the data itself is transferred to and from Cache Mechanism 76
through Data Input/Output (I/O) 102. Data I/O 102 is in turn connected through data paths to Cache
Memory 82 and Large File Cache Memory 90 and, indirectly therethrough, to Mass Storage
Devices 84A and 84B, with the data paths passing through, or controlled by, respectively, Cache
Controller 80 and Large File Cache Controller 88.

Cache Request Input 100 passes the read/write requests to a Cache Log 102, which
records all cache read/write requests for use in managing cache operations by Cache Manager
104, and to a File Filter 106. File Filter 106, in turn, examines all read requests and, in particular,
compares the size of the file or other body of data to be stored with a File Size Threshold 108.
File Size Threshold 108 is a value stored therein that represents a predetermined threshold separating those
files that are judged to be "large" files from all other, smaller, files. File Filter 106 then, based
upon the decision as to whether a particular file or other body of data is "large" or "not large"
and under the direction of Cache Manager 104, directs each read request to Cache Controller 80
when the file is judged "not large" or to Large File Cache Controller 88 when the file is judged
"large", while Cache Manager 104 generates or updates a corresponding entry in Cache Log
102.
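
The size-based routing decision may be illustrated by the following sketch, which is offered only as an assumption-laden example. The class name, the byte values, the string labels for the two controllers, and the treatment of a file exactly at the threshold as "not large" are all choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FileFilter:
    """Hypothetical File Filter 106 with an adjustable File Size Threshold 108."""
    size_threshold: int  # bytes; the value itself is an assumption
    log: list[tuple[str, str]] = field(default_factory=list)  # stands in for Cache Log 102

    def route(self, name: str, size: int) -> str:
        """Return which controller should handle the request and log the decision."""
        target = "large_file_cache" if size > self.size_threshold else "cache"
        self.log.append((name, target))
        return target


file_filter = FileFilter(size_threshold=1_000_000)
video_route = file_filter.route("video.mpg", 40_000_000)
page_route = file_filter.route("index.html", 4_096)

# The threshold is a stored value, so it can be changed while the filter is in use.
file_filter.size_threshold = 50_000_000
video_route_after = file_filter.route("video.mpg", 40_000_000)
```

Keeping the threshold as ordinary mutable state is what allows a manager component to retune it from the logged decisions.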

It should be noted that the threshold value stored in File Size Threshold 108 may vary
from system to system or may vary with time. For example, Cache Manager 104 monitors the
performance of Cache Mechanism 76 through the entries in Cache Log 102 and may alter the
threshold value stored in File Size Threshold 108 depending upon the performance of the
mechanism, the sizes of the files or bodies of data presently typical for the system, the sizes of
files most frequently accessed, the number and sizes of "large" files relative to the total number
of files, and so on.

First considering the case of a "large" file, Large File Cache Controller 88 responds to a
file read request from File Filter 106 by reading the data from one or more Buffers 92 of Large
File Cache Memory 90 if the data is resident in Large File Cache Memory 90. In the present embodiment, Large
File Cache Memory 90 is a predetermined amount of private memory organized, as described above, as
Buffers 92.

As is typical and commonly understood in cache mechanisms, the requested data may not
be resident in the Large File Cache Memory 90 and must thereby be read from mass storage, either
directly or through Large File Cache Memory 90. If the data is read through Large File Cache Memory 90 or if the
data is to be encached in Large File Cache Memory 90, it is necessary to determine which of Buffers 92 is
to receive the data. In the present embodiment, Large File Cache Controller 88 uses a least
frequently used algorithm to determine which Buffer 92 is to be flushed, if necessary. In this regard
File Filter 106 operates with Large File Cache Controller 88, and in a like manner with Cache
Controller 80, to invalidate entries in Large File Cache 86 and in Cache 78 as necessary, as when
data encached therein is removed or overwritten.

Also, it should be noted that in the presently preferred embodiment, Cache Manager 104
tracks the frequency of access of large files encached in Large File Cache 86, using the
information stored in Cache Log 102, and may transfer a large file to Cache 78 or even into
Pinned Memory 94, as described below, if a large file is accessed with sufficient frequency.

In summary, therefore, each read request is received by Cache Request Input 100 and
passed to Cache Log 102, whereupon it is read by Cache Manager 104 which, in turn, determines
whether the corresponding data is stored in Cache 78 or Large File Cache 86. If the request is for
a large file encached in Large File Cache 86, the request is passed to File Filter 106 which in turn
directs Large File Cache 86 to provide the data through Data I/O 102, whereupon Large File
Cache Controller 88 reads the data from Large File Cache Memory 90 to Data I/O 102, first
loading the data from Mass Storage Device 84B to Large File Cache Memory 90, if necessary.
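
The tracking of large file access frequency from the log, noted above, may be sketched as a simple count over logged file names. The function name, the log representation as a list of names, and the promotion threshold are hypothetical and stand in for whatever Cache Manager 104 actually records.

```python
from collections import Counter


def files_to_promote(access_log: list[str], large_files: set[str], min_accesses: int) -> set[str]:
    """Return the large files whose logged access count reaches min_accesses."""
    counts = Counter(name for name in access_log if name in large_files)
    return {name for name, n in counts.items() if n >= min_accesses}


log = ["a.iso", "b.iso", "a.iso", "index.html", "a.iso"]
promoted = files_to_promote(log, large_files={"a.iso", "b.iso"}, min_accesses=3)
```

Files returned by such a function would be candidates for transfer from the large file cache to the ordinary cache; the sketch does not model the transfer itself.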

It will be apparent, therefore, that the provision of a large file cache separate from the
cache for smaller files according to the content sensitive principles of the present invention thereby
avoids interference between these generally less frequently accessed large files and the more
frequently accessed smaller files handled in Cache 78. As such, smaller files are not displaced
from the cache mechanism by a relatively few larger files, or even a single large file, so that the
performance of the cache mechanism as regards the more frequently accessed smaller files is
enhanced.

The performance of the cache mechanism for larger files, moreover, is generally
comparable to or better than that of conventional cache mechanisms, as the large file cache
mechanism likewise prevents small file operations from interfering with the accessing of large
files. In this regard, it should be noted that Large File Cache 86 is essentially embodied as a
buffer for reads from the mass storage devices containing the large files. As such, the capacity of
Large File Cache 86 may typically be sufficient to contain at least a portion of a relatively large
file and may be sufficient to contain a relatively small number of large files or portions thereof,
and the capacity of Large File Cache 86 may be increased as necessary or desired. It will be
apparent that it will frequently be necessary for Large File Cache 86 to load large files or portions
thereof from mass storage. It has been found, however, that this does not adversely affect the
performance of the system as the large files are accessed relatively infrequently, so that the time
required for loading of large file data from mass storage is acceptable. Also, it is more frequently
the case that there will be many successive accesses of a given large file, thereby allowing the
prefetching of file pages in anticipation of need, and that the flushing of an entire large file to
load a different file will be a relatively infrequent event.

Briefly considering the operation of the cache mechanism for smaller files, that is, files
that are not "large", Cache 78 operates in the same manner as described above with regard to
Large File Cache 86, except that Cache 78 encaches files, or other bodies of data, that are not
judged as "large" according to the value currently stored in File Size Threshold 108 and therefore
need not be described in further detail herein.

As has been discussed above, however, Cache Mechanism 76 recognizes that certain data
is typically of such significance to the system or to one or more users or is accessed so frequently
that the data should preferably reside in cache memory at all times to be rapidly accessible at all
times. For this reason, Cache Mechanism 76 includes a Pinned Memory 94 for storing such data
wherein Pinned Memory 94 is designated as "pinned" in that the data stored therein may not be
flushed or transferred out, for example, to a Mass Storage Device 84, or invalidated, in the
normal operation of Cache Mechanism 76, but only in a specific administrative operation outside
of the normal operations of Cache Mechanism 76.

As has been described, in the presently preferred embodiment Pinned Memory 94 is not
physically or organizationally separate from Cache Memory 82. Instead, Cache Memory 82 and
Pinned Memory 94 are comprised of contiguous parts of Memory 96. As such, those parts of
Memory 96 that comprise Cache Memory 82 and Pinned Memory 94 are identified and
distinguished from one another dynamically and by the state of the contents of each part of Memory
96, that is, by whether the contents of a given file residing in Memory 96 are "pinned" or
"unpinned". As such, it is not necessary to physically transfer data between Cache Memory 82
and Pinned Memory 94, but only to track the state of each file resident in Memory 96. Cache
Manager 104 will, in general, only lock a file to "transfer" the file from Cache Memory 82 to
Pinned Memory 94, and the cache will find the pinned pages when the cache defaults.

For this purpose, Cache Manager 104 maintains a Pinned Table 110 in Cache Log 102
wherein Pinned Table 110 contains a Pinned Entry 112 for and corresponding to each file or
body of data that has been designated as "pinned" and thereby "residing" in Pinned Memory 94.
Cache Manager 104 may generate a Pinned Entry 112, for example, when directed to do so by a
user, such as a system administrator, or when the frequency of access of a particular file or body
of data exceeds a threshold value stored in a Pinning Threshold 114 stored in Cache Manager
104, whereupon that data may be transferred from, for example, Cache 78 or Large File Cache
86, to Pinned Memory 94.

As has been described, in other embodiments, Pinned Memory 94 may be separate from
Memory 82, or a single physical Memory 96 may be organized into separate areas designated as
Cache Memory 82 and Pinned Memory 94, so that Cache Memory 82 and Pinned Memory 94
share Memory 96.

Because it is the general case that the most frequently accessed files are smaller files,
that is, those files other than those designated as "large" and therefore those files normally
resident in Cache 78, most files that become designated as "pinned" files and "transferred" into
Pinned Memory 94 will be files from Cache 78. In addition, and as discussed above, a large file
having a sufficiently high frequency of access may be designated as a "pinned" file. Such
large files will generally first be transferred from Large File Cache 86 to Cache 78 as their
frequency of access increases, so that again "transfers" of files to Pinned Memory 94 will
generally be from Cache 78 rather than directly from Large File Cache 86. Again, and for this
reason, Cache Memory 82 and Pinned Memory 94 share a single Memory 96 in the presently
preferred embodiment of Cache Mechanism 76, so that a given file or body of data may be
designated as "pinned" and effectively "transferred" from Cache Memory 82 to Pinned Memory
94 solely by the generation of a corresponding entry in Pinned Table 110.
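
The progression of a frequently accessed large file described above may be sketched as follows. This is an illustrative sketch only: the function names, the location labels, the use of an access count in place of a frequency of access, and in particular the `promote_threshold` governing the move from Large File Cache 86 to Cache 78 are all assumptions, since the description does not state how that move is decided.

```python
def next_location(location, access_count, promote_threshold, pinning_threshold):
    """Return where a file resides after its access count is updated.

    location is one of "large_file_cache", "cache" or "pinned".
    """
    if location == "large_file_cache" and access_count > promote_threshold:
        return "cache"   # Large File Cache 86 -> Cache 78
    if location == "cache" and access_count > pinning_threshold:
        return "pinned"  # Cache 78 -> Pinned Memory 94 (a table entry only)
    return location


def trace(accesses, promote_threshold, pinning_threshold, start="large_file_cache"):
    """Record the location of one file after each successive access."""
    location = start
    history = []
    for count in range(1, accesses + 1):
        location = next_location(
            location, count, promote_threshold, pinning_threshold
        )
        history.append(location)
    return history
```

Under these assumptions a large file is never pinned directly from the large file cache; it passes through the small-file cache first, which matches the general case described above.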

Finally, while the invention has been particularly shown and described with reference to
preferred embodiments of the apparatus and methods thereof, it will be also understood by those
of ordinary skill in the art that various changes, variations and modifications in form, details and
implementation may be made therein without departing from the spirit and scope of the invention
as defined by the appended claims. Therefore, it is the object of the appended claims to cover all
such variations and modifications of the invention as come within the true spirit and scope of the
invention.

Claims:

1. A content hosting system for storing and publishing content, comprising:

a content delivery server for storing and providing content including data of a
corresponding type,

each content delivery server being accessible to a content requester for providing
the content stored therein to the content requester, and

a content delivery administrator connected with each content delivery server and
accessible to at least one content publisher,

the content delivery administrator being responsive to a request by a content
publisher for storing content provided by the content publisher in a content delivery server to be
provided to a content requester.

2. The content hosting system of claim 1, wherein:

the content hosting system includes

two or more content delivery servers, and

a content request director connected with each content delivery server and
accessible to each requester for identifying the locations of content in each content delivery
server,

the content request director containing information identifying the content
stored in each content delivery server and being responsive to a request for content from a
content requester for forwarding the request to the content delivery server hosting the requested
content,

the content delivery server hosting the requested content being responsive to the
request for providing the requested content to the content requester.

3. The content hosting system of claim 1, wherein:

the content delivery administrator is responsive to requests of content requesters and
content publishers for managing the storing of data in the content delivery servers to balance the
load of requests among the content delivery servers.

4. The content hosting system of claim 1, further comprising:

a content backup server connected with each content delivery administrator for storing a
copy of the data stored in each content delivery server and for providing the stored data to the
content delivery servers.

5. The content hosting system of claim 2, further comprising a network server wherein:

the content request director is connected from a network and to the content delivery
servers for providing requests from content requesters connected to the network to the content
delivery servers.

6. A hierarchical content hosting system for storing and operating on data, comprising:

a plurality of content hosting systems organized into a tree structure wherein each node of
the tree includes a content hosting system having a hierarchical relationship defined by the tree
structure, each content hosting system including

a content delivery server for storing and providing content including data
of a corresponding type,

each content delivery server being accessible to a content requester for providing
the content stored therein to the content requester, and

a content delivery administrator connected with each content delivery server and
accessible to at least one content publisher,

the content delivery administrator being responsive to a request by a content
publisher for storing content provided by the content publisher in a content delivery server to be
provided to a content requester.

7. The content hosting system of claim 7, wherein: 

at least one content hosting system includes 

two or more content delivery servers, and 
a content request director connected with each content delivery server and
accessible to each requester for identifying the locations of content in each content delivery
server,

the content request director containing information identifying the content
stored in each content delivery server and being responsive to a request for content from a
content requester for forwarding the request to the content delivery server hosting the requested
content,

the content delivery server hosting the requested content being responsive to the 
request for providing the requested content to the content requester. 

8. A cache mechanism for use in a data processing system, comprising: 

a first cache for storing and providing data from files of less than a stored predetermined
threshold size, including

a first cache controller for controlling operations of the first cache, and

a first cache memory for storing data of files smaller than the threshold size,

a second cache for storing and providing data from files of greater than the stored
predetermined threshold size, including

a second cache controller for controlling operations of the second cache, and

a second cache memory for storing data of files greater than the threshold size,
and

a file filter for storing a value representing the threshold size and responsive to data read
requests for directing read requests for data of files less than the threshold size to the first cache
controller and read requests for data of files greater than the threshold size to the second cache
controller,

the first and second cache controllers being responsive to read requests for
providing the requested data.

9. The cache mechanism of claim 8, further comprising: 

a mass storage device connected from the first and second caches for storing uncached 

data. 

10. The cache mechanism of claim 8, the first cache memory further comprising:

a cache memory for storing and providing the data of files smaller than the threshold size 

and 

a pinned memory for storing and providing the data of a pinned file wherein the data of a 
pinned file is locked from flushing from the pinned memory. 
11. The cache mechanism of claim 10, further comprising:

a cache manager for directing operations of the cache mechanism, including

storing a pinning threshold value representing a threshold frequency of access of
files, and

a cache log for recording accesses to each file having data resident in the first and second
caches,

the cache manager being responsive to the frequency of access of each file having
data resident in the first and second caches for storing each file having a frequency of access
greater than the threshold frequency in the pinned memory.
12. The cache mechanism of claim 11, wherein:

the pinned memory and the cache memory are designated locations in a small cache 
memory, and 

a location of the small cache memory [...]
that the data residing in the location is stored in a pinned table.

13. The cache mechanism of claim 10, wherein:

the cache manager is responsive to the frequency of accesses of a file stored in the second
cache for transferring the file into the pinned memory when the frequency of accesses of the file
is greater than the threshold frequency.